Received: from iesd.auc.dk by optima.cs.arizona.edu (5.65c/15) via SMTP
    id AA05212; Sun, 14 Mar 1993 15:44:17 MST
Received: from yellow.iesd.auc.dk by iesd.auc.dk with SMTP id AA15068
    (5.65c8/IDA-1.5/MD for <tsql@cs.arizona.edu>); Sun, 14 Mar 1993 23:44:40 +0100
Date: Sun, 14 Mar 1993 23:44:40 +0100
From: "Christian S. Jensen" <csj@iesd.auc.dk>
Message-Id: <199303142244.AA15068@iesd.auc.dk>
To: tsql@cs.arizona.edu
Subject: The TSQL Benchmark initiative

Dear colleague,

In a recent posting to this mailing list, Rick Snodgrass formulated a
vision of a comprehensive consensus benchmark for temporal query
languages, the TSQL Benchmark.

I have decided to accept the invitation to coordinate the development
of the first version of the TSQL Benchmark because I feel that the
benchmark will become an important infrastructural component of
temporal databases. As the coordinator, I will try to ensure that
progress is being made and that the outcome is faithful to the initial
intentions.

All who are interested in this topic are invited to participate. This
mailing list will be the medium of communication.

Three tasks must be accomplished initially.

  Task 1: Agree on a database schema.
  Task 2: Agree on an instance of the schema.
  Task 3: Agree on a suitable taxonomy for the benchmark queries.

These tasks will be addressed sequentially during the coming weeks.
When they are completed, the benchmark will be populated with queries.

Below is my proposal for Task 1. The proposal also includes
restrictions on the scope of the benchmark. Comments, suggestions,
improvements, etc. are very welcome.

Best regards,

Christian S. Jensen
Aalborg University


\documentstyle[11pt]{article}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This paper is intended to evolve into the first version of the TSQL
% benchmark.
% The purpose of this draft is to settle on a database schema for the
% benchmark.
% The purpose of the next draft is to settle on an instance for the
% agreed-upon database schema.
% The purpose of the following draft is then to define a taxonomy to
% be used for categorizing the benchmark queries that will follow.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\addtolength{\textwidth}{1.485in}
\setlength{\oddsidemargin}{.1in}
\setlength{\evensidemargin}{.1in}
\addtolength{\topmargin}{-.85in}
\addtolength{\textheight}{1.8in}

\newenvironment{prog}
{ \begin{center}
  \begin{minipage}{3in}
  \begin{tabbing}
  nnnn\=nnnn\=nnnn\=nnnn\=nnnn\=nnnn\=nnnn\=\kill
}{\end{tabbing}
  \end{minipage}
  \end{center}}

\long\def\comment#1{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% PAPER START
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{document}

\title{\Large\bf The TSQL Benchmark \\ DRAFT}
\author{}
\date{March 14, 1993}
\maketitle

\section{Introduction}

The central goal of this document is to provide the temporal database
community with a {\em comprehensive consensus benchmark} for temporal
query languages which is {\em independent} of any existing language
proposal. This is not a performance benchmark, but a {\em semantic}
benchmark which is intended to be an aid in evaluating the
user-friendliness of proposals for temporal query languages. Thus,
temporal query languages should ideally be able to express the
benchmark queries both conveniently and naturally.

To obtain a consensus benchmark, all researchers in temporal databases
will be invited to participate in this initiative, and each researcher
that has contributed significantly will be a coauthor. The electronic
mail distribution {\tt tsql@cs.arizona.edu} is used as the medium for
discussing the benchmark and related issues.

As a consequence of the central goal above, no existing temporal data
models are used or mentioned.
The relation schemas of the benchmark are expressed as sets of
attributes, with the temporal aspects being implicit (of course,
specific temporal data models might add explicit temporal attributes).
The contents of the relations are described in natural language. The
benchmark queries are also given only in natural language.

While the benchmark is not intended to constitute a metric for query
language completeness, it should be comprehensive, to ensure that all
aspects of query language design are covered. Within certain
boundaries, discussed next, it is thus intended to contain all
proposed queries that appear reasonable.

First, the benchmark is of a semantic nature---in its current form, it
is not aimed at performance comparisons. The intention is to provide a
foundation for comparing the descriptive and operational
characteristics and capabilities of temporal query languages, not
their performance characteristics. Properly extended with additional
relation schemas and a variety of large instances, the benchmark can
also be used for performance comparisons.

Second, a number of restrictions are imposed on which types of queries
are admissible in this version of the benchmark, including the
following.

\begin{itemize}

\item{}
Queries are restricted to valid time only. Transaction-time related
queries are not explored, and comprehensiveness is not intended with
respect to user-defined time.

\item{}
Schema evolution and versioning are not considered.

\item{}
Incompleteness is not considered.

\item{}
Recursive queries are not included.

\item{}
Temporal reasoning is beyond the scope of this version of the
benchmark.

\item{}
For simplicity, each relation is used only once in each query.

\item{}
Queries involving aggregation facilities are not considered.

\item{}
Only queries are included---updates are not considered.

\end{itemize}

These advanced aspects are excluded solely for pragmatic reasons, and
the exclusion is not meant to imply in any way that the aspects are
not important. The restrictions simply represent an attempt to reduce
the size of the initial benchmark to manageable proportions.

Finally, it is emphasized that this benchmark is merely the first in a
sequence of ever-more comprehensive benchmarks. Later benchmarks may
relax the above restrictions on the scope of comprehensiveness imposed
on this benchmark.

\section{The Benchmark Database Schema}

\subsection{Criteria}

A suitable database schema for the benchmark should satisfy three
criteria.

\begin{itemize}

\item{}
The schema should be simple. This will aid in making the benchmark
easy to understand. This criterion restricts the number of relation
schemas and the number of attributes of the individual schemas.
Additionally, the names of the relations and of the attributes should
be short, as they will be referenced repeatedly. When an extension is
proposed, the benefits should be carefully compared with the added
complexity.

\item{}
The schema should allow for comprehensiveness within the chosen scope.
Using the schema, it should be possible to formulate queries of all
the types that appear reasonable. This indicates a need for at least
two related relation schemas (for natural join queries).

\item{}
A schema that has already been used frequently is preferred over a new
schema. This guarantees that many existing queries can be adapted
easily to the benchmark.

\end{itemize}

\subsection{The Proposed Schema}

The proposed database schema consists of two valid-time relation
schemas, {\tt Emp} and {\tt Mgr}. In the terminology of the
entity-relationship model, the first schema models an entity set, and
the second a relationship set. They are defined as follows.
Relation {\tt Emp} records relationships between employees, salaries,
and departments, and it contains three attributes, {\tt Name},
{\tt Salary}, and {\tt Dept}. Relation {\tt Mgr} records the
association of employees, as managers, with departments, and it
contains two attributes, {\tt Dept} and {\tt Manager}.

The relation schemas obey the following {\em snapshot} functional
dependencies:

\begin{prog}
For {\tt Emp}: \\
\> {\tt Name} $\rightarrow$ {\tt Salary} \\
\> {\tt Name} $\rightarrow$ {\tt Dept} \\
For {\tt Mgr}: \\
\> {\tt Dept} $\rightarrow$ {\tt Manager}
\end{prog}

Note that both relation schemas are in snapshot Boyce-Codd normal
form.

The attribute {\tt Manager} of {\tt Mgr} is a foreign key for the
attribute {\tt Name} of {\tt Emp}. Thus, a tuple is allowed to exist
in the {\tt Mgr} relation only if, for each non-empty snapshot of this
tuple, the {\tt Manager} attribute value exists as a {\tt Name} value
of some tuple in the simultaneous snapshot of the {\tt Emp} relation.

\end{document}